On the Law of Zipf-Mandelbrot for Multi-Word Phrases

Author

  • Leo Egghe
Abstract

The paper studies the probabilities of occurrence of m-word phrases (m = 2, 3, ...) in relation to the probabilities of occurrence of single words. It is well known that, in the latter case, the law of Zipf (i.e. a power law) is valid. We prove that this is not the case for m-word phrases (m ≥ 2), and we present two independent proofs of this. We furthermore show that, if we want to approximate the resulting distribution by Zipf's law, we obtain exponents p_m in this power law for which the sequence (p_m)_m is strictly decreasing. This explains experimental findings of Smith and Devine, Hilberg and Meyer.

Acknowledgement: The author is grateful to Prof. Dr. R. Rousseau for interesting discussions on the topic of this paper.

Note: Our results should be compared with a heuristic finding of Rousseau, who states that the law of Zipf-Mandelbrot is valid for multi-word phrases. He, however, uses other, less classical assumptions than we do.


Similar resources

Extension of Zipf's Law to Words and Phrases

Zipf’s law states that the frequency of word tokens in a large corpus of natural language is inversely proportional to the rank. The law is investigated for two languages, English and Mandarin, and for n-gram word phrases as well as for single words. The law for single words is shown to be valid only for high frequency words. However, when single word and n-gram phrases are combined together in on...


Beyond the Zipf-Mandelbrot law in quantitative linguistics

In this paper the Zipf-Mandelbrot law is revisited in the context of linguistics. Despite its widespread popularity the Zipf–Mandelbrot law can only describe the statistical behaviour of a rather restricted fraction of the total number of words contained in some given corpus. In particular, we focus our attention on the important deviations that become statistically relevant as larger corpora a...


Is space a word, too?

For words, rank-frequency distributions have long been heralded for adherence to a potentially universal phenomenon known as Zipf’s law. The hypothetical form of this empirical phenomenon was refined by Benoît Mandelbrot to that which is presently referred to as the Zipf-Mandelbrot law. Parallel to this, Herbert Simon proposed a selection model potentially explaining Zipf’s law. However, a signi...


On a General Theorem of Number Theory Leading to the Gibbs, Bose–Einstein, and Pareto Distributions as well as to the Zipf–Mandelbrot Law for the Stock Market

The notion of density of a finite set is introduced. We prove a general theorem of set theory which refines the Gibbs, Bose–Einstein, and Pareto distributions as well as the Zipf law.


Evolutionary Music and the Zipf-Mandelbrot Law: Developing Fitness Functions for Pleasant Music

A study on a 220-piece corpus (baroque, classical, romantic, 12-tone, jazz, rock, DNA strings, and random music) reveals that aesthetically pleasing music may be describable under the Zipf-Mandelbrot law. Various Zipf-based metrics have been developed and evaluated. Some focus on music-theoretic attributes such as pitch, pitch and duration, melodic intervals, and harmonic intervals. Others focus ...



Journal:
  • JASIS

Volume 50

Publication date: 1999